Informal version for personal use Scalable Clustering
Authors
Abstract
2 Clustering Techniques: A Brief Survey
  2.1 Partitional Methods
  2.2 Hierarchical Methods
  2.3 Discriminative vs. Generative Models
  2.4 Assessment of Results
    2.4.1 Internal (model-based, unsupervised) Quality
    2.4.2 External (model-free, semi-supervised) Quality
  2.5 Visualization of Results
Similar resources
Merging Similarity and Trust Based Social Networks to Enhance the Accuracy of Trust-Aware Recommender Systems
In recent years, collaborative filtering (CF) methods have become important and widely accepted techniques for recommender systems. One of these techniques is the user-based approach, which produces recommendations based on the similarity of ratings among like-minded users. However, these systems suffer from several inherent shortcomings such as data sparsity and cold start problems. With the dev...
Using fuzzy c-means clustering algorithm for common lecturer timetabling among departments
The university course timetabling problem is a hard problem that must be solved anew each term, which is an exhausting and time-consuming task. The main technique in the presented approach focuses on making the timetabling of common lecturers among different departments of a university scalable. The aim of this paper is to improve the satisfaction of com...
Scalable techniques for clustering the web pdf
Scalable Clustering. and text mining, spatial database applications, Web analysis, CRM, marketing. Powerful, broadly applicable data mining clustering methods are surveyed below. Since scalability is the major achievement of this blend strategy, this algorithm is. Using typical document clustering techniques on Web opinions produces unsatisfying results. In this work, we propose the scalable distance-ba...
Data Clustering Based on Key Identification
Clustering has been one of the main building blocks in the fields of machine learning and computer vision. Given a pair-wise distance measure, it is challenging to find a proper way to identify a subset of representative exemplars and their associated cluster structures. The recent trend toward big data analysis poses a more demanding requirement on new clustering algorithms to be both scalable and accura...
Exploiting parallelism to support scalable hierarchical clustering
A distributed memory parallel version of the group average Hierarchical Agglomerative Clustering algorithm is proposed to enable scaling the document clustering problem to large collections. Using standard message passing operations reduces interprocess communication while maintaining efficient load balancing. In a series of experiments using a subset of a standard TREC test collection, our par...